Hindsight Network Credit Assignment: Efficient Credit Assignment in Networks of Discrete Stochastic Units

Authors

Abstract

Training neural networks with discrete stochastic variables presents a unique challenge. Backpropagation is not directly applicable, nor are the reparameterization tricks used in networks with continuous stochastic variables. To address this challenge, we present Hindsight Network Credit Assignment (HNCA), a novel gradient estimation algorithm for networks of discrete stochastic units. HNCA works by assigning credit to each unit based on the degree to which its output influences its immediate children in the network. We prove that HNCA produces unbiased gradient estimates with reduced variance compared to the REINFORCE estimator, while the computational cost is similar to that of backpropagation. We first apply HNCA in a contextual bandit setting to optimize a reward function that is unknown to the agent. In this setting, we empirically demonstrate that HNCA significantly outperforms REINFORCE, indicating that the variance reduction implied by our theoretical analysis is significant and impactful. We then show how HNCA can be extended to optimize a more general function of the outputs of a network of stochastic units, where the function is known to the agent. We apply this extended version of HNCA to train a discrete variational auto-encoder and empirically show it compares favourably to other strong methods. We believe that the ideas underlying HNCA can help stimulate new ways of thinking about efficient credit assignment in stochastic compute graphs.
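The abstract compares against the REINFORCE estimator. As a point of reference, that baseline can be sketched for a single Bernoulli unit. This is a minimal illustration, not the HNCA algorithm. The reward function `toy_reward`, the logit value `theta`, and all helper names are hypothetical choices made for this example. The sketch checks numerically that the score-function estimate averages to the exact gradient and prints its per-sample variance.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def toy_reward(b):
    # Hypothetical reward: 1.0 when the unit fires, 0.2 otherwise.
    return np.where(b == 1.0, 1.0, 0.2)

def reinforce_grad_samples(theta, reward_fn, n_samples, rng):
    """Per-sample REINFORCE estimates of dE[R]/dtheta for one Bernoulli unit."""
    p = sigmoid(theta)
    b = (rng.random(n_samples) < p).astype(float)  # b ~ Bernoulli(p)
    rewards = reward_fn(b)
    # Score function: d/dtheta log Bernoulli(b; sigmoid(theta)) = b - p
    return rewards * (b - p)

def exact_grad(theta, reward_fn):
    """Exact gradient, obtained by enumerating both outcomes of the unit."""
    p = sigmoid(theta)
    r1 = reward_fn(np.array([1.0]))[0]
    r0 = reward_fn(np.array([0.0]))[0]
    return p * (1.0 - p) * (r1 - r0)

rng = np.random.default_rng(0)
theta = 0.3  # assumed logit, chosen only for illustration
samples = reinforce_grad_samples(theta, toy_reward, 200_000, rng)

print("REINFORCE mean estimate:", samples.mean())
print("exact gradient:         ", exact_grad(theta, toy_reward))
print("per-sample variance:    ", samples.var())
```

The estimator is unbiased, but each sample carries variance that does not shrink with network size. That variance is the quantity the abstract says HNCA reduces.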

Similar resources

Credit Assignment through Time: Alternatives

Learning to recognize or predict sequences using long-term context has many applications. However, practical and theoretical problems are found in training recurrent neural networks to perform tasks in which input/output dependencies span long intervals. Starting from a mathematical analysis of the problem, we consider and compare alternative algorithms and architectures on tasks for which the ...

The Social Credit Assignment Problem

Social credit assignment is a process of social judgment whereby one singles out individuals to blame or credit for multi-agent activities. Such judgments are a key aspect of social intelligence and underlie social planning, social learning, natural language pragmatics and computational models of emotion. Based on psychological attribution theory, this paper presents a preliminary computational...

Sparse Attentive Backtracking: Long-Range Credit Assignment in Recurrent Networks

A major drawback of backpropagation through time (BPTT) is the difficulty of learning long-term dependencies, coming from having to propagate credit information backwards through every single step of the forward computation. This makes BPTT both computationally impractical and biologically implausible. For this reason, full backpropagation through time is rarely used on long sequences, and trun...

Credit assignment in movement-dependent reinforcement learning.

When a person fails to obtain an expected reward from an object in the environment, they face a credit assignment problem: Did the absence of reward reflect an extrinsic property of the environment or an intrinsic error in motor execution? To explore this problem, we modified a popular decision-making task used in studies of reinforcement learning, the two-armed bandit task. We compared a versi...

A Fuzzy CMAC Neural Network Model Based on Credit Assignment

To improve the online learning speed and accuracy of CMAC, a fuzzy CMAC neural network model based on the credit assignment concept is designed. In the conventional CMAC and fuzzy CMAC learning schemes, the corrected amounts of errors are equally distributed into all addressed hypercubes, regardless of the credibility of those hypercubes' values. The proposed improved learning approach is to use...

Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2022

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v36i8.20874